Below is the mapping of datasets used in the provided code and the variables accessed or manipulated from each dataset:


1. **`vote_zip_crime_w`**
   This dataset is identical to the main exit poll dataset, with weights and crime rate data added.
   - Variables:
     - `out2`: the outcome of the experiment portion of Study 1 (Figure 2). 
     - `out_vote`: the main outcome variable - voting on the recall election. Used in filtering conditions and as the dependent variable in regression models.
     - `crime_rate_2020`: Used as an independent variable in regression models (`log(crime_rate_2020)`).
     - `crime_diff_log`: Used as an independent variable in regression models.
     - `sal`: salience of crime scale. Used as an independent variable in regression models.
     - `redm`: redeemability scale. Used as an independent variable in regression models.
     - `puni_t`: punitivism scale. Used as an independent variable in regression models.
     - `prog`: progressive attitudes scale. Used as an independent variable in regression models.
     - `victim`: Used as an independent variable in regression models.
     - `know`: knowledge of criminal justice politics. Used as an independent variable in regression models.
     - `rr`: racial resentment. Used as an independent variable in regression models.
     - `sym`: racial sympathy. Used as an independent variable in regression models.
     - `gndr`, `idgy`, `race`, `pid`, `home`, `incm`, `age_new`, `edu`: Used as demographic controls in regression models.
     - `raking`: Used as weights in regression models.
     - `response_id`: Used as a clustering variable in regression models.

2. **`lucid_edit`**
   - Variables:
     - `duration_in_seconds`: Used to filter out inattentive respondents.
     - `q389_1`: Used in filtering conditions for inattentive respondents.
     - Variables renamed for further use:
       - `age` → `Age`
       - `gender` → `Gender`
       - `idgy` → `Political_Ideology`
       - `political_party` → `Partisanship`
       - `ethnicity` → `Ethnicity`
       - `home` → `Home_owner`
       - `education` → `Education`
       - `hhi` → `Income`
     - `group1`, `out1`: the treatment indicator and the main outcome variable for Study 3, respectively. Used in analysis and predictions (e.g., RF models and t-tests).
   - Derived Variables:
     - `out_vote_rf`: Created by applying predictions from a Random Forest model.
     - New variables created:
       - Recoded categorical variables (`Gender`, `Partisanship`, `Ethnicity`, `Education`, `Income`) for summarization and tables.
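
The renaming step listed above can be sketched in Python as follows. This is a minimal illustration rather than the original analysis code: the `rename_columns` helper and the sample record are hypothetical, and only the old-to-new name mapping is taken from the list above.

```python
# Mapping of original lucid_edit column names to the renamed versions
# documented in the list above.
RENAME_MAP = {
    "age": "Age",
    "gender": "Gender",
    "idgy": "Political_Ideology",
    "political_party": "Partisanship",
    "ethnicity": "Ethnicity",
    "home": "Home_owner",
    "education": "Education",
    "hhi": "Income",
}


def rename_columns(row, mapping=RENAME_MAP):
    """Return a copy of `row` with keys renamed; unmapped keys are kept as-is."""
    return {mapping.get(key, key): value for key, value in row.items()}


# Hypothetical respondent record, for illustration only.
example = {"age": 34, "hhi": 5, "group1": 1, "out1": 3}
renamed = rename_columns(example)
print(renamed)
```

Variables that are not in the mapping (such as `group1` and `out1`) pass through unchanged, which matches how they are referenced elsewhere in this document.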

**`lucid_ps_match`** - identical to the `lucid_edit` dataset, but used for the propensity score analysis.

**`text` (Created from `lucid_edit`)**
   - Variables:
     - `why_1`, `why_2`, `why_3`: Combined into a single variable (`why`).
     - `group1`, `out1`: Used in filtering conditions for subsets (`opp_less`, `opp_diff`, `opp_more`).
   - Tokenized and processed for text analysis.
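
A minimal sketch of how the three open-ended fields could be combined and tokenized is shown below. The helper names (`combine_why`, `tokenize`), the tokenization rule, and the sample responses are assumptions for illustration; the original code may use a dedicated text-analysis package with different preprocessing.

```python
import re


def combine_why(row):
    """Join the non-empty open-ended answers into a single `why` string."""
    parts = [row.get(name) for name in ("why_1", "why_2", "why_3")]
    return " ".join(p.strip() for p in parts if p and p.strip())


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


# Hypothetical responses, for illustration only.
row = {"why_1": "Crime is rising", "why_2": "", "why_3": "Too lenient"}
why = combine_why(row)
tokens = tokenize(why)
print(why)
print(tokens)
```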

3. **`da_edit`**
   - Variables:
     - `out_vote`: the main outcome (voting on the recall election). Used to create subsets (`da_recall`, `da_no`) and as the dependent variable in regression models.
     - `zip`: voter's zip code.
     - `prog1`, `prog2`: progressive attitudes scale. Used to classify observations as progressive (`da_prog`) or non-progressive (`da_no_prog`).
     - `prog`: the combined progressive attitudes scale.
     - `group1`, `group2`: treatment assignment in Studies 1 and 2. Used in models and visualization.
     - `out2`: Used in models and visualization (policy support - experiment in Study 1).
     - `response_id`: Used as a clustering variable in models.


4. **`igs_edit`** - used only to create Figure C.1 (face validity of the progressive attitudes scale).
   - Variables:
     - `prog`: Used to create a subset (`igs_prog`) and in summarization and plotting by county.
     - `county`: Used to create subsets based on county names (`igs_la` and others for plotting).


5. **`igs_oct_23`**
   - Variables:
     - `Q3`: Used to create the `q3_coded` variable. Question wording: "Which of the following four statements about how the government should approach public safety comes closest to your views?" Response options:
       - 1: "more extensive, more intensive"
       - 2: "more extensive, less intensive"
       - 3: "less extensive, more intensive"
       - 4: "less extensive, less intensive"
     - `w1`: Used to calculate weighted proportions (`table1_weighted_all` and `table1_prop_weighted_all`).
     - `region`: Used to create a subset for "BAY" (`igs_bay`).
     - `CNTY`: Used to create a subset for "SF" (`igs_sf`).
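
The weighted proportions of `Q3` responses can be illustrated with a short sketch. This assumes a simple weighted share (sum of `w1` per response divided by the total weight); the `weighted_proportions` function and the sample rows are hypothetical and do not reproduce how `table1_weighted_all` or `table1_prop_weighted_all` are actually built.

```python
from collections import defaultdict


def weighted_proportions(rows, value_key="Q3", weight_key="w1"):
    """Return the share of total weight falling in each response category."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[value_key]] += row[weight_key]
    grand_total = sum(totals.values())
    return {value: weight / grand_total for value, weight in totals.items()}


# Hypothetical respondents, for illustration only.
rows = [
    {"Q3": 1, "w1": 2.0},
    {"Q3": 2, "w1": 1.0},
    {"Q3": 1, "w1": 1.0},
]
props = weighted_proportions(rows)
print(props)
```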


6. **`highlight` (Subset of `da_recall`, only "conflicted progressives")**
   - Variables:
     - `out_vote`, `out1`, `group1`: Used for analysis and plotting.

7. **Other Derived Subsets and Variables:**
   - **`da_prog`:** Subset of `da_edit` where voters are progressive.
   - **`da_no_prog`:** Subset of `da_edit` for non-progressives.
   - **`igs_bay`:** Subset of `igs_oct_23` where `region == 1`.
   - **`igs_sf`:** Subset of `igs_oct_23` where `CNTY == 38`.
   - **`highlight`:** Subset of `da_recall` for conflicted progressives.
   - **`prog_norecall`:** Progressives who opposed the recall.
   - **`no_prog_norecall`:** Non-progressives who opposed the recall.
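
The two code-based subsets above (`region == 1` and `CNTY == 38`) amount to simple equality filters. The sketch below shows the idea in Python; the `subset` helper and the sample rows are hypothetical, and only the two filter conditions come from the list above.

```python
def subset(rows, **conditions):
    """Keep the rows whose fields equal every given condition."""
    return [
        row
        for row in rows
        if all(row.get(key) == value for key, value in conditions.items())
    ]


# Hypothetical igs_oct_23 records, for illustration only.
igs_oct_23 = [
    {"region": 1, "CNTY": 38},
    {"region": 1, "CNTY": 1},
    {"region": 2, "CNTY": 19},
]

igs_bay = subset(igs_oct_23, region=1)  # documented condition: region == 1
igs_sf = subset(igs_oct_23, CNTY=38)    # documented condition: CNTY == 38
print(len(igs_bay), len(igs_sf))
```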
